A Kalman Filter based Low Complexity Throughput Prediction Algorithm for 5G Cellular Networks
Throughput Prediction is one of the primary preconditions for the
uninterrupted operation of several network-aware mobile applications, namely
video streaming. Recent works have advocated using Machine Learning (ML) and
Deep Learning (DL) for cellular network throughput prediction. In contrast,
this work proposes a simple, computationally lightweight solution which
models the future throughput as a multiple linear regression of several present
network parameters and present throughput. It then feeds the variance of
prediction error and measurement error, which is inherent in any measurement
setup but unaccounted for in existing works, to a Kalman filter-based
prediction-correction approach to obtain the optimal estimates of the future
throughput. Extensive experiments across seven publicly available 5G throughput
datasets for different prediction window lengths have shown that the proposed
method outperforms the baseline ML and DL algorithms by delivering more
accurate results within a shorter timeframe for inferencing and retraining.
Furthermore, in comparison to its ML and DL counterparts, the proposed
throughput prediction method is also found to deliver higher QoE to both
streaming and live video users when used in conjunction with popular Model
Predictive Control (MPC) based adaptive bitrate streaming algorithms.
Comment: 13 pages, 14 figures
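The prediction-correction idea in the abstract above can be sketched as a scalar Kalman filter whose prediction step is a linear regression. This is a minimal illustration under stated assumptions, not the paper's formulation: the class name `ScalarKalman`, the regression coefficients, and the noise variances `q` and `r` are all hypothetical, and the abstract does not say how the authors set up the state or estimate the variances.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ScalarKalman:
    """Hypothetical scalar Kalman filter for a throughput estimate."""
    q: float          # assumed variance of the regression's prediction error
    r: float          # assumed variance of the measurement error
    x: float = 0.0    # current throughput estimate
    p: float = 1.0    # variance of the current estimate

    def predict(self, a: float, weights: Sequence[float],
                features: Sequence[float], bias: float = 0.0) -> float:
        # Prediction step: linear regression on the present throughput (a * x)
        # and on the present network parameters (weights . features).
        self.x = a * self.x + sum(w * f for w, f in zip(weights, features)) + bias
        # The estimate's variance grows by the prediction-error variance.
        self.p = a * a * self.p + self.q
        return self.x

    def correct(self, z: float) -> float:
        # Correction step: blend the prediction with the noisy measurement z.
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x = self.x + k * (z - self.x)
        self.p = (1.0 - k) * self.p
        return self.x


kf = ScalarKalman(q=1.0, r=1.0, x=10.0, p=1.0)
predicted = kf.predict(a=1.0, weights=[0.5], features=[2.0])
corrected = kf.correct(z=14.0)
```

A larger `r` relative to `q` makes the filter trust the regression more than the measurement, and a smaller `r` does the opposite.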
Manifold-Preserving Transformers are Effective for Short-Long Range Encoding
Multi-head self-attention-based Transformers have shown promise in different
learning tasks. Although these models exhibit significant improvement in
understanding short-term and long-term contexts from sequences, encoders of
Transformers and their variants fail to preserve layer-wise contextual
information. Transformers usually project tokens onto sparse manifolds and fail
to preserve mathematical equivalence among the token representations. In this
work, we propose TransJect, an encoder model that guarantees a theoretical
bound for layer-wise distance preservation between a pair of tokens. We propose
a simple alternative to dot-product attention to ensure Lipschitz continuity.
This allows TransJect to learn injective mappings to transform token
representations to different manifolds with similar topology and preserve
Euclidean distance between every pair of tokens in subsequent layers.
Evaluations across multiple benchmark short- and long-sequence classification
tasks show maximum improvements of 6.8% and 5.9%, respectively, over the
variants of Transformers. Additionally, TransJect displays 79% better
performance than Transformer on the language modeling task. We further
highlight the shortcomings of multi-head self-attention from the statistical
physics viewpoint. Although multi-head self-attention was conceived to learn
different abstraction levels within the networks, our empirical analyses
suggest that different attention heads learn in a random, disorderly fashion. In
contrast, TransJect adapts a mixture of experts for regularization; these
experts are more orderly and balanced and learn different sparse
representations from the input sequences. TransJect exhibits very low entropy
and can be efficiently scaled to larger depths.
Comment: 17 pages, 7 figures, 5 tables, Findings of the Association for
Computational Linguistics: EMNLP202
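The distance-preservation property named in the abstract above can be illustrated with the simplest injective, distance-preserving map: a rotation. This is only a toy sketch of the property, not TransJect's layers or its attention alternative, which the abstract does not specify. The helper names `rotate` and `apply_layers` and the sample tokens are hypothetical.

```python
import math


def rotate(v, theta):
    """Rotate a 2-D vector by theta radians (an orthogonal, injective map)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def apply_layers(tokens, angles):
    """Apply one rotation per 'layer' to every token representation."""
    for theta in angles:
        tokens = [rotate(t, theta) for t in tokens]
    return tokens


tokens = [(1.0, 2.0), (-3.0, 0.5), (0.0, 4.0)]
out = apply_layers(tokens, [0.3, 1.1, -0.7])

# Pairwise Euclidean distances before and after the stacked layers.
before = [math.dist(tokens[i], tokens[j]) for i in range(3) for j in range(i + 1, 3)]
after = [math.dist(out[i], out[j]) for i in range(3) for j in range(i + 1, 3)]
```

Because each layer is an isometry, `before` and `after` agree up to floating-point error no matter how many layers are stacked.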
Persona-aware Generative Model for Code-mixed Language
Code-mixing and script-mixing are prevalent across online social networks and
multilingual societies. However, a user's preference toward code-mixing depends
on the socioeconomic status, demographics of the user, and the local context,
which existing generative models mostly ignore while generating code-mixed
texts. In this work, we make a pioneering attempt to develop a persona-aware
generative model to generate texts resembling real-life code-mixed texts of
individuals. We propose a Persona-aware Generative Model for Code-mixed
Generation, PARADOX, a novel Transformer-based encoder-decoder model that
encodes an utterance conditioned on a user's persona and generates code-mixed
texts without monolingual reference data. We propose an alignment module that
re-calibrates the generated sequence to resemble real-life code-mixed texts.
PARADOX generates code-mixed texts that are semantically more meaningful and
linguistically more valid. To evaluate the personification capabilities of
PARADOX, we propose four new metrics -- CM BLEU, CM Rouge-1, CM Rouge-L and CM
KS. On average, PARADOX achieves 1.6 points better CM BLEU, 47% better
perplexity and 32% better semantic coherence than the non-persona-based
counterparts.
Comment: 4 tables, 4 figures
Domain adaptation based transfer learning approach for solving PDEs on complex geometries
In machine learning, if the training data and the test data are independently and identically distributed, a trained model can make accurate predictions for new samples of data. Conventional machine learning depends strongly on massive amounts of domain-specific training data to capture their latent patterns. In contrast, domain adaptation and transfer learning are sub-fields of machine learning concerned with solving the inescapable problem of insufficient training data by relaxing the domain dependence hypothesis. In this contribution, we address this issue: by combining both methods in a novel way, we develop a computationally efficient and practical algorithm to solve boundary value problems based on nonlinear partial differential equations. We adopt a meshfree analysis framework to integrate the prevailing geometric modelling techniques based on NURBS and present an enhanced deep collocation approach that also plays an important role in the accuracy of solutions. We start with a brief introduction on how these methods expand upon this framework. We observe excellent agreement between these methods and show how fine-tuning a pre-trained network to a specialized domain may lead to outstanding performance compared to existing ones. As proof of concept, we illustrate the performance of our proposed model on several benchmark problems. © 2022, The Author(s)
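The collocation idea mentioned in the abstract above can be sketched as a loss that penalizes the PDE residual at interior points and the mismatch at the boundary. This is a toy sketch under loud assumptions, not the paper's method: it uses a linear 1-D problem, -u''(x) = pi^2 sin(pi x) on [0, 1] with u(0) = u(1) = 0, a central finite difference in place of the automatic differentiation a deep collocation method would typically use, and fixed trial functions in place of a neural network. The name `collocation_loss` is hypothetical.

```python
import math


def collocation_loss(u, n_points=9, h=1e-3):
    """Mean squared PDE residual at interior points plus a boundary penalty."""
    xs = [i / (n_points + 1) for i in range(1, n_points + 1)]
    residuals = []
    for x in xs:
        # Central finite-difference approximation of u''(x).
        u_xx = (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)
        source = math.pi ** 2 * math.sin(math.pi * x)
        residuals.append(-u_xx - source)
    interior = sum(r * r for r in residuals) / len(residuals)
    boundary = u(0.0) ** 2 + u(1.0) ** 2
    return interior + boundary


# The exact solution gives a loss near zero; a wrong trial function does not.
loss_exact = collocation_loss(lambda x: math.sin(math.pi * x))
loss_wrong = collocation_loss(lambda x: x * (1.0 - x))
```

In a deep collocation setting, a loss of this kind is minimized over network weights. Fine-tuning a pre-trained network, as the abstract describes, would then mean starting that minimization from weights learned on a related domain.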